Summary by CodeRabbit
New Features
Added comprehensive support for the Dremio data warehouse across workflows, macros, and integration tests.
Introduced macros for Dremio-specific SQL operations, including datetime handling, reserved keyword escaping, table operations, and standard deviation calculations.
Enhanced integration tests with Dremio Docker Compose setup and dynamic schema/database property resolution.
Improved handling of reserved SQL keywords in queries and test data.
Bug Fixes
Corrected aliasing and escaping of reserved keywords in SQL queries to prevent syntax errors.
Refined anomaly score logic to handle single-value training sets.
Refactor
Updated test data column naming conventions for clarity and consistency.
Streamlined macro and test implementations for better cross-database compatibility.
Chores
Updated .gitignore to exclude additional internal directories.
Expanded workflow test matrices to include Dremio.
* dremio support in anomaly tests - WIP
* git ignore dbt_internal_packages (relevant for fusion but no harm in adding now)
* get_package_database_and_schema - use non-views for dremio
* anomaly tests fixes
* remove space
* add newline
* dremio datediff - different implementation for seconds diff
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: GuyEshdat
PR: elementary-data/dbt-data-reliability#838
File: macros/utils/table_operations/insert_rows.sql:156-0
Timestamp: 2025-08-10T11:28:43.591Z
Learning: In the Elementary dbt-data-reliability codebase, the `dremio__escape_special_chars` macro in `macros/utils/table_operations/insert_rows.sql` receives strings where single quotes are already escaped as `\'` from upstream processing. The macro correctly converts these to Dremio's SQL escaping format by replacing `\'` with `''`. This is different from other database implementations that handle bare single quotes directly.
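The conversion described in this learning can be modelled outside Jinja as a single string replacement. The sketch below is a Python stand-in, not the macro itself: the function name `dremio_escape_special_chars` is borrowed for illustration, and it assumes the only transformation of interest is `\'` to `''` (the real macro may handle other characters as well).

```python
def dremio_escape_special_chars(value: str) -> str:
    """Model of the described behaviour: upstream has already turned ' into \\',
    so the backslash-quote pair is rewritten to the SQL-style doubled quote."""
    return value.replace("\\'", "''")


# Input as it would arrive from upstream processing: it\'s
escaped = dremio_escape_special_chars("it\\'s")
print(escaped)  # it''s
```

A string without a backslash-escaped quote passes through unchanged, which matches the note that this implementation does not handle bare single quotes directly.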
📚 Learning: 2025-07-27T11:33:35.918Z
Learnt from: haritamar
PR: elementary-data/dbt-data-reliability#827
File: macros/utils/cross_db_utils/generate_elementary_profile_args.sql:30-31
Timestamp: 2025-07-27T11:33:35.918Z
Learning: In the Elementary dbt package, the Snowflake profile generation macro intentionally always includes private_key and private_key_passphrase parameters regardless of the authentication method. This is a strategic decision to deprecate password authentication and guide all new users toward keypair authentication, even if it means showing keypair parameters when they might not currently be needed.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
GitHub Check: test (latest_official, databricks_catalog) / test
GitHub Check: test (1.8.0, postgres) / test
GitHub Check: test (latest_official, trino) / test
GitHub Check: test (latest_official, athena) / test
GitHub Check: test (latest_official, redshift) / test
GitHub Check: test (latest_pre, postgres) / test
GitHub Check: test (latest_official, postgres) / test
GitHub Check: test (latest_official, snowflake) / test
GitHub Check: test (latest_official, bigquery) / test
GitHub Check: test (latest_official, clickhouse) / test
188-191: Confirm mapping of database/schema to Dremio source/folder.
This maps elementary_database -> object_storage and elementary_schema -> object_storage_path. Ensure elementary.get_package_database_and_schema() returns the Dremio source and folder path under Dremio, not conventional database/schema names.
If needed, I can adjust the Dremio dispatch of get_package_database_and_schema to return the correct pair.
186-193: No changes needed—Dremio Cloud fields are correct
The Dremio adapter for Cloud specifically exposes target.cloud_project_id and target.cloud_host (rather than generic project_id/host). In our macro we map those to the Elementary service parameter names "project_id" and "host", and the object storage parameters ("object_storage" and "object_storage_path") match the downstream profiling API. Everything aligns with both the Dremio adapter and Elementary’s expected arguments.
In macros/utils/cross_db_utils/generate_elementary_profile_args.sql around lines
184 to 194, the macro dremio__generate_elementary_profile_args is missing the
_parameter("type", target.type) entry that other adapter macros include. Add
_parameter("type", target.type) to the returned list to ensure consistency and
completeness of the generated profile arguments.
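To make the mapping in the comments above concrete, here is a rough Python model of the parameter list the Dremio macro would return once `type` is added. It is only a sketch: `dremio_profile_args` is a hypothetical stand-in for the Jinja macro, the `(name, value)` tuples stand in for `_parameter(...)` calls, and the real macro may emit additional fields or a different order.

```python
from typing import Any, Dict, List, Tuple


def dremio_profile_args(
    target: Dict[str, Any], database: str, schema: str
) -> List[Tuple[str, Any]]:
    """Hypothetical model: adapter-specific target fields are renamed to the
    generic parameter names, and database/schema become source/folder path."""
    return [
        ("type", target["type"]),                    # the entry the review asks to add
        ("host", target["cloud_host"]),              # target.cloud_host -> "host"
        ("project_id", target["cloud_project_id"]),  # target.cloud_project_id -> "project_id"
        ("object_storage", database),                # elementary_database
        ("object_storage_path", schema),             # elementary_schema
    ]


example_target = {"type": "dremio", "cloud_host": "example.host", "cloud_project_id": "p-1"}
for name, value in dremio_profile_args(example_target, "my_source", "folder.sub"):
    print(f"{name}: {value}")
```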
GRANT USAGE ON PROJECT TO USER "{{ parameters["user"] }}";
GRANT SELECT ON ALL DATASETS IN FOLDER {% for part in (parameters["object_storage"] ~ "." ~ parameters["object_storage_path"]).split(".") %}"{{ part }}"{% if not loop.last %}.{% endif %}{% endfor %} TO USER "{{ parameters["user"] }}";
Unescaped double quotes in parameters["user"], ["object_storage"], or ["object_storage_path"] can break SQL.
If object_storage_path is empty, the split/join yields an empty quoted segment (""), producing invalid SQL.
When no path is provided, GRANT should likely target the source, not a folder.
Proposed fix replaces the block to safely escape quotes and support both source-only and folder paths.
-{% macro dremio__get_user_creation_query(parameters) %}
-CREATE USER "{{ parameters["user"] }}";
-
-GRANT USAGE ON PROJECT TO USER "{{ parameters["user"] }}";
-GRANT SELECT ON ALL DATASETS IN FOLDER {% for part in (parameters["object_storage"] ~ "." ~ parameters["object_storage_path"]).split(".") %}"{{ part }}"{% if not loop.last %}.{% endif %}{% endfor %} TO USER "{{ parameters["user"] }}";
-{% endmacro %}
+{% macro dremio__get_user_creation_query(parameters) %}
+{%- set user = parameters["user"] | replace('"', '""') -%}
+{%- set source = parameters["object_storage"] | replace('"', '""') -%}
+{%- set path = parameters.get("object_storage_path") -%}
+
+CREATE USER "{{ user }}";
+
+GRANT USAGE ON PROJECT TO USER "{{ user }}";
+{%- if path %}
+GRANT SELECT ON ALL DATASETS IN FOLDER "{{ source }}"{% for seg in path.split('.') %}."{{ seg | replace('"','""') }}"{% endfor %} TO USER "{{ user }}";
+{%- else %}
+GRANT SELECT ON ALL DATASETS IN "{{ source }}" TO USER "{{ user }}";
+{%- endif %}
+{% endmacro %}
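The branching and escaping in this suggestion can be checked in isolation with a small Python model. This is a sketch of the suggested template logic only: `quote_ident` and `grant_select` are hypothetical helper names, and the statement shapes are copied from the suggestion as written, not verified against Dremio's grant grammar.

```python
from typing import Optional


def quote_ident(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded double quote."""
    return '"' + name.replace('"', '""') + '"'


def grant_select(user: str, source: str, path: Optional[str]) -> str:
    """Folder grant when a path is given, source-only grant otherwise."""
    if path:
        # One quoted segment per dot-separated part, as in the suggested loop.
        parts = [source] + path.split(".")
        target = "FOLDER " + ".".join(quote_ident(p) for p in parts)
    else:
        # An empty or missing path never produces an empty quoted segment.
        target = quote_ident(source)
    return f"GRANT SELECT ON ALL DATASETS IN {target} TO USER {quote_ident(user)};"


# The original concatenation with an empty path leaves a trailing empty segment:
print(("my_source" + "." + "").split("."))  # ['my_source', '']

print(grant_select("analyst", "my_source", "sales.q1"))
print(grant_select("analyst", "my_source", ""))
```

Quoting each segment separately keeps a double quote inside one segment from terminating the identifier early, which is the failure mode the first bullet describes.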
🤖 Prompt for AI Agents
In macros/utils/cross_db_utils/get_user_creation_query.sql around lines 109 to
114, the current macro does not properly escape double quotes in
parameters["user"], ["object_storage"], and ["object_storage_path"], which can
cause SQL injection or syntax errors. Also, if object_storage_path is empty, it
generates invalid SQL by including an empty quoted segment. To fix this,
implement proper escaping of double quotes in all identifiers and add logic to
check if object_storage_path is empty; if so, grant permissions on the source
only instead of a folder path. Adjust the templating to handle both cases
safely.
GRANT USAGE ON PROJECT TO USER "{{ parameters["user"] }}";
GRANT SELECT ON ALL DATASETS IN FOLDER {% for part in (parameters["object_storage"] ~ "." ~ parameters["object_storage_path"]).split(".") %}"{{ part }}"{% if not loop.last %}.{% endif %}{% endfor %} TO USER "{{ parameters["user"] }}";
💡 Verification agent
🧩 Analysis chain
Confirm Dremio SQL support and exact privilege syntax.
Validate that the following constructs are supported in your Dremio environment:
CREATE USER "";
GRANT USAGE ON PROJECT TO USER "";
GRANT SELECT ON ALL DATASETS IN FOLDER ... TO USER "";
When no folder path is provided, GRANT SELECT ON ALL DATASETS IN ""
If syntax differs between Dremio Cloud and Software, we may need conditional generation.
🌐 Web query:
Verify Dremio SQL support for:
1) Creating users via SQL (CREATE USER) in Dremio Cloud/Software and the exact syntax.
2) Granting USAGE ON PROJECT to a USER and the exact syntax.
3) Granting SELECT ON ALL DATASETS IN FOLDER "<source>"."<path>" and the alternative for source-level grants (without folder).
Note any differences between Dremio Cloud vs Software.
💡 Result:
Dremio provides SQL commands for user management and access control, with certain functionalities varying between Dremio Cloud and Dremio Software. Below are the details addressing your queries:
1) Creating Users via SQL (CREATE USER) in Dremio Cloud/Software and the Exact Syntax
Dremio Cloud: The CREATE USER command is supported, allowing administrators to create new users directly through SQL.
Note: The username must be enclosed in double quotes. (docs.dremio.com)
Dremio Software: The CREATE USER command is not supported. User management is typically performed through the Dremio Console or by using the Dremio REST API. (docs.dremio.com)
2) Granting USAGE ON PROJECT to a USER and the Exact Syntax
In Dremio, the USAGE privilege on a project allows a user to access and interact with the project's contents.
Syntax:
GRANT USAGE ON PROJECT "<project_name>" TO USER "<username>";
Note: Both project and username are enclosed in double quotes. (docs.dremio.com)
3) Granting SELECT ON ALL DATASETS IN FOLDER "<source>"."<path>" and the Alternative for Source-Level Grants (Without Folder)
Dremio allows granting privileges on datasets within a specific folder or source.
Granting SELECT on All Datasets in a Folder:
Syntax:
GRANT SELECT ON ALL DATASETS IN FOLDER "<folder_path>" TO USER "<username>";
Example:
GRANT SELECT ON ALL DATASETS IN FOLDER "Sales/2025" TO USER "[email protected]";
Note: The folder path and username are enclosed in double quotes. (docs.dremio.com)
Granting SELECT on All Datasets in a Source:
Syntax:
GRANT SELECT ON ALL DATASETS IN SOURCE "<source_name>" TO USER "<username>";
Example:
GRANT SELECT ON ALL DATASETS IN SOURCE "SalesData" TO USER "[email protected]";
Note: The source name and username are enclosed in double quotes. (docs.dremio.com)
Differences Between Dremio Cloud and Dremio Software:
User Creation: Dremio Cloud supports the CREATE USER SQL command, whereas Dremio Software requires user management through the Dremio Console or REST API.
Granting Privileges: The SQL syntax for granting privileges, such as USAGE on projects and SELECT on datasets, is consistent across both Dremio Cloud and Dremio Software.
For comprehensive details on user management and access control in Dremio, refer to the official documentation:
Update Dremio SQL syntax and handle Cloud vs Software differences
Target: macros/utils/cross_db_utils/get_user_creation_query.sql (lines 112–114)
– In Dremio Cloud only, you can create users via CREATE USER "<username>";
Dremio Software does not support CREATE USER in SQL—you must call the REST API or use the UI. Add a flag or document this difference.
– GRANT USAGE must include the project name:
Replace GRANT USAGE ON PROJECT TO USER "{{ parameters.user }}";
with GRANT USAGE ON PROJECT "{{ parameters.project }}" TO USER "{{ parameters.user }}";
– GRANT SELECT ON ALL DATASETS IN FOLDER expects a single quoted, slash-separated path, not a series of dot-quoted identifiers. For example: GRANT SELECT ON ALL DATASETS IN FOLDER "{{ parameters.object_storage }}/{{ parameters.object_storage_path }}" TO USER "{{ parameters.user }}";
– When there’s no folder path, use a source-level grant: GRANT SELECT ON ALL DATASETS IN SOURCE "{{ parameters.object_storage }}" TO USER "{{ parameters.user }}";
Please update the macro to reflect these exact syntaxes and add conditional logic for Dremio Software’s lack of SQL user-creation.
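As a rough illustration of the conditional generation requested here, the following Python sketch assembles the statements in the forms quoted in this comment. Several things are assumptions: the `is_cloud` flag and the `project` parameter do not exist in the current macro, `dremio_user_setup_sql` is a hypothetical name, and the statement syntax comes from the web query result above and has not been confirmed against a running Dremio instance.

```python
from typing import List, Optional


def q(name: str) -> str:
    """Double-quote an identifier, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def dremio_user_setup_sql(
    user: str,
    project: str,
    source: str,
    path: Optional[str] = None,
    is_cloud: bool = True,  # hypothetical flag: Cloud vs Software
) -> List[str]:
    statements = []
    if is_cloud:
        # Per the comment, SQL user creation is only available on Dremio Cloud.
        statements.append(f"CREATE USER {q(user)};")
    statements.append(f"GRANT USAGE ON PROJECT {q(project)} TO USER {q(user)};")
    if path:
        # Single quoted, slash-separated path, as the comment describes.
        folder = q(source + "/" + path)
        statements.append(f"GRANT SELECT ON ALL DATASETS IN FOLDER {folder} TO USER {q(user)};")
    else:
        # No folder path: fall back to a source-level grant.
        statements.append(f"GRANT SELECT ON ALL DATASETS IN SOURCE {q(source)} TO USER {q(user)};")
    return statements


for stmt in dremio_user_setup_sql("analyst", "my_project", "my_source", "sales/q1"):
    print(stmt)
```

Returning a list of statements, not one joined string, keeps the Software case simple: the caller receives only the grants and can create the user through the REST API or UI separately.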
🤖 Prompt for AI Agents
In macros/utils/cross_db_utils/get_user_creation_query.sql around lines 112 to
114, update the GRANT USAGE statement to include the project name by changing it
to GRANT USAGE ON PROJECT "{{ parameters.project }}" TO USER "{{ parameters.user
}}". Replace the GRANT SELECT ON ALL DATASETS IN FOLDER statement to use a
single quoted, slash-separated path "{{ parameters.object_storage }}/{{
parameters.object_storage_path }}" instead of dot-quoted identifiers. Add
conditional logic to grant SELECT ON ALL DATASETS IN SOURCE "{{
parameters.object_storage }}" TO USER "{{ parameters.user }}" when there is no
folder path. Also, add a flag or conditional to only include CREATE USER
"<username>"; in Dremio Cloud, omitting it for Dremio Software since it does not
support SQL user creation.